R Validation Hub

Status Report & Workshop

Doug Kelkhoff

2023-09-18

👋 Who We Are

The R Validation Hub is a collaboration to support the adoption of R within a biopharmaceutical regulatory setting (pharmaR.org)

  • Grew out of R/Pharma 2018
  • Led by participants from ~10 organizations
  • With frequent involvement from health authorities (primarily the FDA)
  • And subscribers from ~60 organizations spanning multiple industries

🤝 Affiliates: PSI/AIMS (CAMIS)

Comparing Analysis Method Implementations in Software
A cross-industry group comprising members from PHUSE, PSI, and ASA.

  • Released a white paper providing guidance on the appropriate use of statistical methods, for example:
    • Don’t default to the defaults
    • Be specific when drafting analysis plans, including precise methods & options
  • A resource for looking up the details of method implementations across languages

🤝 Affiliates: PSI/AIMS (CAMIS)

CAMIS Comparisons Resources

  Methods                       R    SAS    Comparison
  Summary Statistics Rounding   R    SAS    R vs SAS
  Summary Statistics            R    SAS    R vs SAS

🤝 Affiliates: R Consortium

Works with and provides support to the R Foundation and to the key organizations developing, maintaining, distributing and using R software

Key Activities

  • The R Validation Hub
  • R Submission Working Group
  • R Repositories Working Group (i.e., CRAN enhancements & future)

👷‍♂️ What We Do (pharmaR.org)

Products

White Paper

Guidance on compliant use of R and management of packages

New! Repositories

Building a public, validation-ready resource for R packages

Coline Zeballos

New! Communications

Connecting validation experts across the industry

Juliane Manitz

{riskmetric}

Gather and report on risk heuristics to support validation decision-making

Eric Milliman

{riskassessment}

A web interface to {riskmetric}, supporting review, annotation and cataloging of decisions

Aaron Clark

New! {riskscore}

An R data package capturing risk metrics across all of CRAN

Aaron Clark
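
The {riskmetric} workflow behind these products can be sketched in a few lines. This is a minimal sketch of the pipeline its documentation describes (reference a package, assess it, score it); it assumes {riskmetric} is installed and that some assessments may need network access for repository metadata:

```r
# Hedged sketch of the documented {riskmetric} pipeline:
# derive a package reference, run risk assessments, then score them.
library(riskmetric)

pkg_ref("riskmetric") |>   # reference a package (here, riskmetric itself)
  pkg_assess() |>          # gather risk heuristics (downloads, docs, tests, ...)
  pkg_score()              # convert assessments into numeric risk scores
```

{riskassessment} wraps this same pipeline in a Shiny interface so reviewers can annotate and catalog the resulting decisions.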

📊 A Quick Survey

Keep your hand raised if…

  • It’s early morning and you need an excuse to stretch
  • This is your first time hearing about the R Validation Hub
  • You’re missing Andy’s posh accent
  • Your org contributes to the R Validation Hub
  • Your org leverages the R Validation Hub guidelines
  • Your org uses R Validation Hub tools ({riskmetric}, {riskassessment})

🗓️ Agenda

  • Updates (20 min)
  • Established Workstream Recap (10 min)
    past, present & future
  • Leaps of Faith: Setting the Tone for our Future (20 min)
  • Open Discussion
    • What’s Next? (20 min)
    • Design Lab (10 min)
  • Closing

📣 Updates

🗝 Key Policy Updates!

If nothing else, take this home!

  • The FDA appears to accept .R files through their eSUB portal.
  • The FDA has released a draft of a new Computer Software Assurance guideline that increasingly appears to be the basis for their evaluation of R.

🗝 Key Policy Updates!

If nothing else, take this home!

Identifying Intended Use

Software used directly as part of production or the quality system (for automation, inspection, testing, or the collection and processing of production data) is in scope, as is software that supports development, monitoring, and automated testing. A manufacturer should use a risk-based analysis to determine the appropriate assurance activities.

🗝 Key Policy Updates!

If nothing else, take this home!

Determining the Appropriate Assurance Activities

Assurance can include:

  • Ad-hoc testing
  • Exploratory testing (active package use)
  • Error-guessing (regression testing)
  • Robust scripted testing and limited scripted testing (traceable, reproducible testing suites)

“This approach may apply scripted testing for high-risk features”

Change of Leadership

  • You may have noticed that I am not Andy Nicholls.
  • Last year, Andy decided to step down to focus on his growing responsibilities as Head of Data Science at GSK.

Pulse Check

  • We looked back on how we had been working
  • Identified new opportunities
    1. Refine our holistic strategic direction
    2. Be more mindful about communication and organization
  • We have a new Communication workstream! (and awesome new slides!)
  • More ways to get involved

📜 Workstream Report

R Validation Hub Case Studies

{riskmetric}

{riskassessment}

📦 Repositories Workstream

Repositories Workstream

Supporting a transparent, open, dynamic, cross-industry approach to establishing and maintaining a repository of R packages.

  • Taking ample time to engage stakeholders
    • Validation leads across the industry
    • Active health authority involvement
    • Analytic environment admins and developers
  • Considering the possibilities
    • Mapping needs to solutions that meet the industry where it is
    • …while building the path for it to move forward

How did we get here?

  • Our whitepaper is widely adopted
  • But its implementation is inconsistent & laborious
    • Variations throughout industry pose uncertainty
    • Sharing software with health authorities is a challenge
    • Health authorities, overwhelmed by technical inconsistencies, are more likely to question software use
  • We feel the most productive path forward is a shared ecosystem

Old dog, new trick

  • Modern package ecosystems are the stats world’s new trick
  • Methods are provided directly by statisticians and academics, rarely by vendors.
  • Risk is managed not by itemized requirements, but by good development practices.
  • We need to learn how to manage risk in a constantly evolving ecosystem

Different strokes

Vendored Stats Products

  • Off-the-shelf cohort.
  • Internal tools developed against cohort packages.
  • New package versions risk incompatibility.
  • Steep upgrade cost (time, development).
  • System-specific mix of packages.

Data Science Ecosystem

  • A “snapshot” of a living repository.
  • Internal tools developed against the latest packages.
  • New packages can be reviewed and upgraded at will.
  • Living ecosystem, constantly vetted against new releases.
  • More likely what is used by HAs.

What does a solution look like?

If it ain’t broke, don’t fix it!

  • R has this wonderful thing called CRAN, setting the standard of quality
    • Packages are constantly tested together
    • R has a culture of amazing documentation
    • Statisticians flock to R, and are constantly vetting its implementations

What does a solution look like?

Fool me twice, shame on me

  • R has this thorn in its side called CRAN
    • Builds are difficult to reproduce (key for validation)
    • Quality indicators are lacking
    • Difficult to roll back to an older snapshot (although tools exist to help with this)
    • Governance isn’t always the most friendly
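
As one illustration of the snapshot problem, rollback tooling does exist today. A hedged sketch, assuming Posit Public Package Manager's date-based snapshot URLs (lockfile-based tools such as {renv} are another common approach):

```r
# Hedged sketch: pin an R session to a dated CRAN snapshot so package
# versions resolve reproducibly, rather than against today's CRAN.
# The snapshot URL scheme is Posit Public Package Manager's, not CRAN's own.
options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2023-09-18"))

install.packages("dplyr")  # installs the version available on 2023-09-18
```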

What does a solution look like?

Closing the CRAN gap for the Pharma Use Case

  • Reproducibility guidelines
  • Standard, public assessment of packages
  • Avenues for communicating about implementations, bugs, security

The Proposal so Far

Repositories Workstream

Work to-date

  1. Stakeholder engagement (3 mo)
  2. Product refinement and proof-of-concept planning (1 mo)
  3. POC development (2 mo)